Last time, we discussed the high availability and fault tolerance Cassandra (C*) provides automatically and transparently. These are invaluable features and are only possible because C* sacrifices strict consistency. Today, we’ll look at C*’s eventual consistency model, and how applications can adapt and leverage it.
With C*, clients specify how important consistency is on a per request basis. This is expressed via a consistency level, attached to each query and mutation. The consistency level determines at what point the client is notified that the request succeeded. Available levels are listed here, but we’ll focus on just three today:
Level | Read | Write |
ONE | Return data from nearest replica | Return when one replica commits data |
QUORUM | Return most recent data from majority of replicas | Return when majority of replicas commit data |
ALL | Return most recent data from all replicas | Return when all replicas commit data |
Let’s discuss writes first. A write is always sent to all replicas responsible for the data in the mutation, but the consistency level specifies how many replicas must commit the write before the client’s request is fulfilled. This means a client may consider a write “complete” when some replicas haven’t yet committed the data! For reads, the consistency level determines how many replicas respond to the read request, and the coordinator node returns the most recent data of the queried replicas. (See our previous entry which discusses coordinator nodes)
This brings us to the crux of eventual consistency. Since all nodes don’t necessarily have the most recent value for a piece of data, and since all nodes aren’t necessarily queried on a read, it is possible for clients to receive old data from a read. If this were the only factor, everyone would always use consistency level ALL and be done with this eventual consistency nonsense. If only life were so easy. Let’s see how consistency levels impact performance.
ccm start
ccm node1 cqlsh -f <this_file>
ccm node1 cqlsh
use demo;
tracing on;
consistency ONE;
SELECT * FROM testcf;
consistency QUORUM;
SELECT * FROM testcf;
consistency ALL;
SELECT * FROM testcf;
The tracing
command allows us to see the path each request takes, returning the time to fulfill the request in microseconds. The consistency
command changes the consistency level for future requests through cqlsh. Then you should see a dramatic change in response time as the consistency level increases. In my case my times for ONE, QUORUM, and ALL respectively were 9252, 19399, and 22784 μs. Since we have a replication factor of three, each higher level hits one additional node. Greater replication factors, and a production data load, would show an even more dramatic impact. Note that caches Cassandra has can affect these times, but they still show an accurate trend.
Consistency levels don’t just impact response time – they also impact availability. Exit cqlsh and take one of our five nodes down with ccm <node> stop
, connect to a remaining node, and execute SELECT * FROM testcf;
with all three consistency levels.
With one node down, and a replication factor of three, at least two replicas for any data are still up, allowing ONE and QUORUM to succeed. ALL however requires all three replicas to respond, thus one node going down causes the request to fail! If you were to try taking another node down and repeating the test only ONE would succeed.
Cassandra provides strong consistency when R + W > N, where R and W are the number of nodes hit by the read and write requests respectively, and N is the replication factor. But it sacrifices its performance and availability to do so. There is no way around this trade off. It’s not all doom and gloom though. Edward Capriolo’s book performs some benchmarking showing for 10,000 inserts and a consistency level of ONE for read and write, only ~0.9% reached a node that hadn’t committed the write. On a second attempt, only ~0.03% still didn’t have the data. Further, C* allows different requests, with different consistency requirements, to be handled independently. Requests for which throughput is of maximum concern can use ONE. Finally, read and write levels can be different, if throughput for those two cases differ for your application. Maybe eventual consistency isn’t so bad after all…
This blog concludes our Meet Cassandra series. We hope you enjoyed getting to know Cassandra with us and learning about this exciting technology that is gaining considerable traction in our industry. There’s still plenty more to learn about Cassandra – virtual nodes, hinted handoff, read repair, and secondary indexes to name a few – but we hope this series gave you a solid foundation to build upon in your future with Cassandra!